ABSTRACT

The purpose of this study was to evaluate the 
effectiveness of using performance and portfolio assessment 
techniques to diversify assessment in a minority teacher education 
program at Tuskegee University (Alabama). Both performance and 
portfolio assessments served as the exit examination. These 
techniques were used for three reasons: a state traditional 
certification examination was thrown out by the Alabama courts 
because it had negative impact on minority students; the university 
serves a predominantly minority population; and traditional tests 
have not been very valid measures of performance tasks such as 
teaching. Grades from portfolio and performance assessments were 
obtained from the files of 30 graduates of the teacher education 
program. These were correlated with grades in methods courses, 
foundation courses, and grade point average (GPA). Significant 
correlations were found between methods classes, on the one hand, and 
performance measures and portfolios on the other. Though the 
performance assessment measure was significantly correlated to GPA, 
no relationship was found between portfolio and GPA. Also, the 
portfolio score was not significantly correlated to the performance 
measure. The results suggested that performance and portfolio 
measures elicit skills and abilities that are relatively independent 
of those elicited by traditional tests as represented by the GPA and 
are independent of each other. It was concluded that though 
performance and portfolio measures show good promise for diversifying 
assessment for teacher certification, portfolio assessment itself 
needs to be improved. The appendix provides portfolio criteria from 
the Educational Testing Service Praxis Series of Assessments. 
(Contains 24 references.) (ND) 
Abstract

The purpose of this study was to evaluate the effectiveness of using performance and portfolio
assessment techniques to diversify assessment in a minority teacher education program. Performance and
portfolio assessments served as our exit examination. These techniques were used for three reasons: a
state traditional certification examination was thrown out by the Alabama courts because it had negative
impact on minority students; we serve a predominantly minority population; and teaching is a performance
task and traditional tests have not been very valid measures of performance tasks such as teaching. Data
were obtained from the files of 30 graduates of the teacher education program. Students' grades from
portfolio and performance assessments were correlated with grades in methods courses, foundation
courses, and GPA. There were significant correlations between methods classes, on the one hand, and
performance measure (r(30) = 0.41, p < .05) and portfolio (r(30) = 0.32, p < .05), on the other.
Though the performance assessment measure was significantly correlated to GPA (r(30) = 0.40), there
was no relationship between portfolio and GPA (r(30) = 0.09). Also, portfolio score was not
significantly correlated to the performance measure. The results suggested that performance and portfolio
measures elicit skills and abilities that are relatively independent of those elicited by traditional tests as
represented by the GPA and are independent of each other. Inter-rater reliability coefficients on portfolio
assessment, based on individual Praxis III Series criteria, ranged from rxx(25) = 0.0 to rxx(25) = 0.38.
It was concluded that though performance and portfolio measures show good promise for diversifying
assessment for teacher certification, portfolio assessment itself needs to be improved. Suggestions for
improvement were made.
Diversity in Teacher Assessment:
What's Working, What's Not?

The purpose of this study was to determine to what extent portfolio assessment has helped achieve
the goal of assessment diversification in a teacher education exit examination system.

Perspectives and Theoretical Framework 

Teacher evaluation based on standardized one-shot objective examinations is now widely
recognized as insensitive to a host of diverse and important teacher education attributes and contextual
variables. Teaching is recognized as complex, holistic, highly integrated and contextualized
(Athanases, 1990; Barton & Collins, 1993; Dwyer, 1993). Thus, standardized pencil-and-paper tests
appear not to be appropriate, authentic, or valid enough to use for decisions regarding teaching
performance, certification and promotion (Chittenden, 1991; Stiggins, 1986; Wiggins, 1989). In
addition to being inappropriate for use with teaching, standardized teacher examinations tend to have
negative impact on minorities (Bredekamp & Shepard, 1989).

The argument against traditional approaches to teacher assessment was strengthened in recent
years by the courts' rejection of many state-developed teacher certification examinations. The failure rate
of blacks and other minorities on such examinations, in Alabama, was disproportionate to that of whites.
Furthermore, these examinations often are said to lack criterion-related validity (Bredekamp & Shepard,
1989). Other authors dealing with younger populations have made similar observations about traditional
tests in general. For example, Richert and McDonnel (1982) in the National Report on Identification
suggest that the use of paper-and-pencil tests leads to the exclusion of minorities from gifted programs.
Also, Shaklee (1992) suggested that performance-based assessments are valid approaches for
documenting the potential of under-represented groups in gifted programs. According to Chittenden
(1991), authentic assessments may more accurately reflect holistic approaches to teaching than traditional
ones. Thus, given the population we serve (which is predominantly black), and the purpose of testing
(which is demonstration of teaching competence), it made sense to use a performance-type test for our exit
examination, rather than a traditional test.

In an attempt to provide more varied, valid and authentic measures of teaching competence,
researchers and practitioners have revived interest in old techniques long ignored, and developed new
ones, including journals, portfolios, and various forms of performance-based methods such as
classroom observations. Use of these techniques in evaluating teaching competence has spawned
hundreds of journal articles and books on performance assessment in general and portfolio assessment in
particular. Regardless of their increasing popularity, however, many of the techniques have serious
weaknesses in objectivity, reliability, validity and generalizability (Linn, 1993).

Consistent with a national trend toward performance-based assessment, Educational Testing
Service (ETS), developer of the widely-used National Teachers Examination (NTE), developed the
Praxis Series III: Classroom Performance Assessments, which are based on several years of research on
the conceptualization of teaching as a highly integrative, productive, and complex activity (Dwyer, 1993).
Teaching entails engaging students as active learners to induce changes in their previous knowledge, a
constructivist view of learning. In the constructivist theory tradition, learning is viewed as an active
process of constructing meaning. Thus, teacher assessments should reflect a diversity of classroom
contexts (including differences in subject matter, students' backgrounds and styles of learning). Given the
complexity of this approach to teacher assessment, we need trained, sensitive evaluators. One example of
such an approach is found in the Praxis III: Classroom Performance Assessments, whose assessors must
participate in five days of training and pass a proficiency test before certification of proficiency by ETS.

ETS is not alone among teacher evaluation organizations in its development of criteria for
demonstrating quality teaching. Other groups, including the National Council for the Accreditation of
Teacher Education (NCATE), the National Board for Professional Teaching Standards, the National
Policy Board for Educational Administration (work led by practicing educators), the Interstate New
Teacher Assessment and Support Consortium (work led by the nation's chief state school officers), and
many state departments of education, include performance criteria and standards in their assessment of
teachers. Many states have also used or are using performance assessment in grades K-12, for example:
Vermont (Koretz et al., 1994), Connecticut, California, and New York (Baron, 1991). Some benefits
reported in these sites include more equitable evaluation of minorities than with traditional tests, divergent
thinking, cooperative learning, creativity, and interdependence among team members. Performance tests
also have been reported to facilitate equal learning among boys and girls and among various ethnic
groups. In small groups, minorities who normally remain segregated in regular classrooms cease to be
a minority, according to Mangani (as cited in Baron, 1991).

Portfolio assessment often is included in performance-based assessment programs. Portfolios
are used for various purposes. Because of portfolios' face validity (Barton & Collins, 1993), many
educators use them as substitutes for traditional tests (Black, 1990; Dagvarian, 1989; Woodrow, 1982),
while others use them as supplements (Bird, 1990; Forrest, 1990; Nweke, 1990; O'Brien, 1990; Wolf,
1990). The portfolio technique is also used to assess general education (Hunter, 1989) and as a
supplementary resume (Nweke, 1990).

How Portfolio Assessment is Used at Tuskegee University to Address Preservice Teacher Diversity:
Methods 

Since 1993, teacher education programs in Alabama have been required by the State Department of
Education to develop their own exit examinations and administer them to their students. This policy
followed a lengthy legal battle regarding the state's previous state-developed pencil-and-paper certification
examination which, according to the courts, had a negative impact on blacks.

In developing our exit examination, Tuskegee University decided that simply constructing another
pencil-and-paper test would not adequately address diversity issues, especially in view of our
predominantly-black student body. Other reasons for our decision were the controversy surrounding
the state's pencil-and-paper test of teacher competency, the large number of pencil-and-paper tests which
students take in their college courses, and the emerging data regarding performance-based and portfolio
assessment. Our solution was what we call the Comprehensive Examination. The Comprehensive
Examination is in two parts: Part 1 is a performance-based evaluation; Part 2 is a portfolio. Parts 1 and 2
use the same Praxis Series criteria for evaluation (see Appendix). The rating form for Part 1, the
performance-based evaluation, is completed collaboratively by a student's cooperating teacher and
university supervisor at the end of the student's internship, and is designed to assess the student's
classroom performance and ability to integrate theory and practice. Students' ratings are based on their
performance during student teaching over a period of ten weeks.
The Portfolio, Part 2, is developed by student teachers to demonstrate that each Praxis criterion is
met. The portfolio assessment allows students to show, in diverse ways, that they have met each
criterion and that they can integrate theory and practice. Each portfolio is evaluated by two to four
readers, comprising the students' cooperating teachers, university supervisors, and two other Tuskegee
University professors. A portfolio grade is the average of scores from all raters. The portfolio and
performance test are technically more in accord with our newly-developed and implemented
"Constructivist Reflective" model in the School of Education than traditional tests are. In this model,
we encourage personal and reflective responses from students, through journal writing, personal
histories, and reflection on lessons taught. We chose the portfolio and performance techniques for the
exit exam because, for our purposes, they better reveal a student's development toward becoming a
teacher and because, at least conceptually, they seem to have higher criterion-related validity than more
traditional tests. Such methods also recognize that attainment of a performance outcome (for example,
"building instruction on students' prior learning and academic strengths") can best be verified through
observation or performance assessment. In the next section the method for the study is described.

Data Source 

Data were obtained from the files of 30 secondary, elementary, and early childhood students who
graduated from Tuskegee University between 1993 and 1995. Specifically, students' grades on
portfolios, performance assessments, GPAs, methods, and foundations courses were obtained.
Portfolio and classroom performance grades were correlated with students' overall GPA and average
grades on foundations and methods courses. The GPA was computed on all courses except for student
teaching and portfolios. Inter-rater reliability among cooperating teachers and teacher education faculty
was also investigated.
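
The correlational analysis described above can be illustrated with a minimal sketch in Python. The paper does not say what software was used and does not list individual grades, so the helper name `pearson_r` and the five student records below are hypothetical and serve only to show the computation.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sample has zero variance")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical records, NOT the study's data: one entry per graduate.
records = [
    {"gpa": 2.9, "methods": 3.0, "portfolio": 88},
    {"gpa": 3.4, "methods": 3.7, "portfolio": 95},
    {"gpa": 3.1, "methods": 3.3, "portfolio": 84},
    {"gpa": 2.7, "methods": 2.8, "portfolio": 90},
    {"gpa": 3.6, "methods": 3.5, "portfolio": 92},
]

portfolio = [rec["portfolio"] for rec in records]
for measure in ("gpa", "methods"):
    r = pearson_r(portfolio, [rec[measure] for rec in records])
    print(f"portfolio vs {measure}: r = {r:.3f}")
```

The same helper applies to any pair of measures in the study (performance ratings, foundations grades, and so on), provided each list holds one value per student in the same order.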

Results 

The classroom performance assessment showed a positive and significant correlation with overall
grade point average (GPA) (r(30) = 0.399, p < .05) and with methods courses (r(30) = 0.413, p < .05).
It showed no significant relationship, however, with the portfolio measure (r(30) = 0.219, p > .05) or
foundation courses (r(30) = 0.162, p > .05). (See Table 1.)
The portfolio measure showed a significant relationship with methods classes (r(30) = 0.32, p <
.05). However, it had no relationship with GPA (r(30) = 0.087, p > .05), and a positive but
nonsignificant relationship with foundation courses (r(30) = 0.150, p > .05) and the Classroom Performance
Assessment (r(30) = 0.219, p > .05).
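
As a rough check on the significance statements, a correlation can be converted to a t statistic with n - 2 degrees of freedom. The sketch below assumes n = 30 (as stated in Table 1) and uses the standard tabulated critical t values for 28 degrees of freedom at the .05 level. The paper does not report whether its tests were one- or two-tailed, so both thresholds are shown; the function names are hypothetical.

```python
import math

N = 30        # sample size stated in Table 1
DF = N - 2    # degrees of freedom for testing a Pearson correlation

# Tabulated critical t values for df = 28 at alpha = .05
T_CRIT_TWO_TAILED = 2.048
T_CRIT_ONE_TAILED = 1.701

def t_from_r(r, n):
    """t statistic for testing H0: rho = 0 given a sample correlation r."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def critical_r(t_crit, df):
    """Smallest |r| whose t statistic reaches the given critical value."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

# Coefficients as printed in Table 1
reported = {
    "performance vs GPA": 0.399,
    "performance vs methods": 0.413,
    "performance vs foundations": 0.162,
    "portfolio vs GPA": 0.087,
    "portfolio vs foundations": 0.150,
    "portfolio vs methods": 0.320,
    "portfolio vs performance": 0.219,
}

for label, r in reported.items():
    t = t_from_r(r, N)
    print(f"{label:<28} r = {r:.3f}  t = {t:.2f}  "
          f"one-tailed: {t > T_CRIT_ONE_TAILED}  "
          f"two-tailed: {abs(t) > T_CRIT_TWO_TAILED}")

print(f"critical r, two-tailed: {critical_r(T_CRIT_TWO_TAILED, DF):.3f}")
print(f"critical r, one-tailed: {critical_r(T_CRIT_ONE_TAILED, DF):.3f}")
```

Under these assumptions, a coefficient must reach roughly 0.36 (two-tailed) or 0.31 (one-tailed) to be significant at the .05 level with 30 students.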

It was also found that the inter-rater reliability coefficients between cooperating teachers and
teacher education faculty were very low. As shown in Table 2, these coefficients range from a low of
rxx = 0.0 to a high of rxx = 0.38, with a median reliability coefficient of rxx = 0.09. All inter-rater
reliability figures between cooperating teachers and faculty raters were zero or negative, except one.
Inter-rater correlations were higher among the teacher education faculty. The lowest was rxx = 0.02 and
the highest was rxx = 0.81, with a median of rxx = 0.31.

Discussion 

These results suggest that portfolio assessment elicits different skills and abilities from those 
measured by traditional tests. Therefore, portfolios show promise for diversifying measures used in 
teacher certification decisions. 

Interestingly, the one positive correlation between cooperating teachers and faculty raters involves
the faculty member whose ratings correlated negatively with most of the other faculty raters.

The low inter-rater coefficients are a cause for concern. The first question they raise is: Why are
the correlations so low? Low to medium inter-rater reliability coefficients are not uncommon in the area of
portfolio assessment (Koretz et al., 1992; Nystrand et al., 1993). An average reliability figure of 0.43 was
reported on separate areas of writing portfolios in the Vermont portfolio assessment program (Koretz et
al., 1992). The average reliability increased to 0.58 when computed on total scores. There are some
possible explanations for the low reliability coefficients. The first is generosity or leniency error. There
were indications that cooperating teachers were trying to be nice and to "help out" their student teachers.
In these cases, the verbal and written comments from the cooperating teachers did not match the perfect or
near-perfect grades they awarded the student interns.

Second, there was evidence of inadequate comprehension of some criteria. For example, Criterion 1.2
under Content Knowledge for Teaching appeared to have been misunderstood. It states that "teaching
intern demonstrates an understanding of the connections between the content of an instructional event and
what was studied previously or remains to be studied in the future." The criterion is really asking for
evidence that the student teacher understands the scope and sequence of content taught. Entries under
Criterion 1.2 were not always satisfactory.

A third explanation for low inter-rater reliability coefficients is restriction of range. The student
interns are a highly homogeneous group due to prior selection. To qualify for student internship, students
would have taken and passed all required courses with a GPA of 2.5 or better. They also had to have
performed satisfactorily at an interview. Thus, it would not be unusual for the interns to make similar
grades on the performance and portfolio assessments. Also, the reliability figures were based on ratings
on a scale of 1-5 assigned to individual criteria rather than total portfolio score. For example, a faculty
member awarded a score of 94 to a portfolio, while the cooperating teacher awarded 95, and yet the
inter-rater reliability between the two raters was 0.09!
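
The contrast in this example, nearly identical totals alongside a near-zero item-level coefficient, can be reproduced with a small sketch. The ratings below are invented for illustration (25 criteria on a 1-5 scale, chosen so that the totals come to 94 and 95). They are not the raters' actual scores, and `pearson_r` is a hypothetical helper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of ratings."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented ratings on 25 criteria (1-5 scale), NOT the study's data.
# Both raters stay within a narrow band (3s and 4s), which mimics a
# restricted range, and they give their lower marks to different criteria.
faculty = [3] * 6 + [4] * 19                 # total = 94
cooperating = [4] * 5 + [3] * 5 + [4] * 15   # total = 95

print("totals:", sum(faculty), sum(cooperating))
print("item-level r:", round(pearson_r(faculty, cooperating), 3))
```

Because both raters use only a narrow band of the scale, a handful of disagreements on individual criteria is enough to push the item-level coefficient toward zero even though the two totals differ by a single point.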

The final explanation is lack of training. Neither the teacher education faculty nor cooperating
teachers received any formal training in portfolio or performance assessment. Training will be conducted.

Educational Importance of Study

More and more, teacher education programs use portfolios as part of their assessment systems.
Teacher educators are seeking understanding of the role of portfolios in teacher education and the
relationship of portfolios to more traditional assessment techniques. Educators especially want to know
what specific skills and abilities portfolios measure, how valid they are as a measure of teacher
competence, and their adequacy in addressing the issue of diversity. This study adds to the growing
knowledge base of portfolio assessment in teacher education regarding the relationship of portfolios to
other evaluation measures, and common problems in implementing portfolio assessment.
Appendix
Portfolio Criteria¹

1.0 CONTENT KNOWLEDGE FOR TEACHING

1.1 Professional teaching intern demonstrates knowledge of content
through instructional events that are logically sequenced and that are sound and
accurate reflections of the content.

1.2 Professional teaching intern demonstrates an understanding of the
connections between the content of an instructional event and what was
studied previously or remains to be studied in the future.

1.3 Professional teaching intern can create or select curricular materials and other
resources, learning activities, and evaluation strategies that are clearly linked to the
intent or goal of the instructional event.

2.0 TEACHING FOR STUDENT LEARNING

2.1 Professional teaching intern builds instruction on students'
prior learning and academic strengths.

2.2 Professional teaching intern accommodates students' individual
interests, developmental levels, and cultural resources by engaging them in a
variety of learning activities.

2.3 Professional teaching intern monitors students' understanding of
content through a variety of means, providing feedback to students to assist
learning, and adjusting learning activities as the situation demands.

2.4 Professional teaching intern makes expectations clear to students, setting high
expectations for all, and helping students take responsibility for their own learning.

2.4 Professional teaching intern makes content comprehensible to
students through clear and focused explanations, and meaningful
examples, analogies, metaphors, and/or demonstrations.

2.5 Professional teaching intern encourages students to extend their
thinking beyond factual knowledge.

3.0 CLASSROOM COMMUNITY FOR STUDENT LEARNING

3.1 To the extent possible in this classroom, professional teaching
intern creates a purposeful and well-functioning learning
community with convenient and well-understood classroom
routines.

3.2 To the extent possible in this classroom, professional teaching
intern creates an attractive and safe physical environment
arranged in ways conducive to student learning.

3.3 To the extent possible in this classroom, professional teaching
intern makes standards of behavior and consequences of
misbehavior clear to students, and handles disruptions
efficiently and with respect.

3.4 To the extent possible in this classroom, professional teaching
intern creates a classroom climate that ensures equity and respect for and among
students.

3.5 To the extent possible in this classroom, professional teaching
intern establishes and maintains rapport with students.

3.6 To the extent possible in this classroom, professional teaching
intern communicates high expectations for the learning and
behavior of all students.

4.0 TEACHER PROFESSIONALISM

4.1 The professional teaching intern is able to reflect on and analyze his/her own
instruction; characterize successes and failures; identify actions taken and rationales
for them; and determine the extent to which instructional goals are met.

4.2 Professional teaching intern is able to explain how insights gained from
instructional experiences can be used to improve instruction.

4.3 Professional teaching intern demonstrates personal responsibility for student
learning.

4.4 To the extent possible in this setting, professional teaching intern is
able to build professional relationships with colleagues to share
teaching insights and coordinate learning activities for students.

4.5 To the extent possible in this setting, professional teaching intern is
able to communicate with parents regarding student learning, and, where
appropriate, interact effectively with the community.

¹ The criteria are from Educational Testing Service's Praxis Series of assessments. The ETS criteria were
published in ETS Policy Notes, Volume 3, Number 2, Spring 1991.
Table 1

Correlation Among Traditional and Non-Traditional Measures

                               GPA       FDN       Methods   STTCHG
Grade Point Average (GPA)
Foundation (FDN)               0.738*
Methods Courses                0.666*    0.655*
Student Teaching (STTCHG)      0.399*    0.162     0.413*
Portfolio                      0.087     0.150     0.320*    0.219

n = 30. * = significant at 0.05 level
Table 2

Inter-Rater Reliability Coefficients

R1  R2  R3  R4

Rater 1
Rater 2  .31
Rater 3
Rater 4
Rater 5  .13  .64  .31  .57
Rater 6
Rater 7  -.02  -.10
Rater 8  .23
Rater 9  .61
Rater 10  -.04
PSchl  .38

R5  R6  R7  R8  R9

.02
.76
.56  .23
.62
.18
.32
.62
.81
.77  .66  .31  .47  .57
.18
.45
-.19
-.08
-.04  .00
-.40

n = 25 items
range of scores per item is 1-5